Network histograms and universality of blockmodel approximation
نویسندگان
چکیده
In this paper we introduce the network histogram, a statistical summary of network interactions to be used as a tool for exploratory data analysis. A network histogram is obtained by fitting a stochastic blockmodel to a single observation of a network dataset. Blocks of edges play the role of histogram bins and community sizes that of histogram bandwidths or bin sizes. Just as standard histograms allow for varying bandwidths, different blockmodel estimates can all be considered valid representations of an underlying probability model, subject to bandwidth constraints. Here we provide methods for automatic bandwidth selection, by which the network histogram approximates the generating mechanism that gives rise to exchangeable random graphs. This makes the blockmodel a universal network representation for unlabeled graphs. With this insight, we discuss the interpretation of network communities in light of the fact that many different community assignments can all give an equally valid representation of such a network. To demonstrate the fidelity-versus-interpretability tradeoff inherent in considering different numbers and sizes of communities, we analyze two publicly available networks--political weblogs and student friendships--and discuss how to interpret the network histogram when additional information related to node and edge labeling is present.
منابع مشابه
Stochastic blockmodel approximation of a graphon: Theory and consistent estimation
Non-parametric approaches for analyzing network data based on exchangeable graph models (ExGM) have recently gained interest. The key object that defines an ExGM is often referred to as a graphon. This non-parametric perspective on network modeling poses challenging questions on how to make inference on the graphon underlying observed network data. In this paper, we propose a computationally ef...
متن کاملAsymptotic Normality of Maximum Likelihood and its Variational Approximation for Stochastic Blockmodels
Variational methods for parameter estimation are an active research area, potentially offering computationally tractable heuristics with theoretical performance bounds. We build on recent work that applies such methods to network data, and establish asymptotic normality rates for parameter estimates of stochastic blockmodel data, by either maximum likelihood or variational estimation. The resul...
متن کاملBeyond Worst-Case (In)approximability of Nonsubmodular Influence Maximization
We consider the problem of maximizing the spread of influence in a social network by choosing a fixed number of initial seeds, formally referred to as the influence maximization problem. It admits a (1 − 1/e)-factor approximation algorithm if the influence function is submodular. Otherwise, in the worst case, the problem is NP-hard to approximate to within a factor of N1−ε. This paper studies w...
متن کاملUniversality of Serial Histograms
Many current relational database systems use some form of histograms to approximate the frequency distribution of values in the attributes of relations and based on them estimate query result sizes and access plan costs. The errors that exist in the histogram approximations directly or transitively affect many estimates derived by the database system. We identify the class of serial histograms ...
متن کاملA state-space mixed membership blockmodel for dynamic network tomography
In a dynamic social or biological environment, the interactions between the underlying actors can undergo large and systematic changes. The latent roles or membership of the actors as determined by these dynamic links will also exhibit rich temporal phenomena, assuming a distinct role at one point while leaning more towards a second role at an another point. To capture this dynamic mixed member...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Proceedings of the National Academy of Sciences of the United States of America
دوره 111 41 شماره
صفحات -
تاریخ انتشار 2014